TNO Hierarchical topic detection report at TDT 2004
Abstract
Hierarchical topic detection (HTD) is a new task in the TDT 2004 evaluation program. It aims to organize a collection of unstructured news data into a directed acyclic graph (DAG) that reflects the topics discussed in the collection, ranging from rather coarse, category-like nodes to fine-grained single events. The HTD task poses interesting challenges because its evaluation metric combines two components: a travel cost, reflecting the time needed to find the node of interest starting from the top node, and a quality cost, determined by the quality of the selected node. We present a scalable architecture for HTD and compare several alternative choices for agglomerative clustering and DAG optimization in order to minimize the HTD cost metric. The alternatives are evaluated on the TDT3 and TDT5 test collections.
Related resources
Results of the 2003 Topic Detection and Tracking Evaluation
The National Institute of Standards and Technology (NIST) administered the sixth open evaluation of Topic Detection and Tracking (TDT) technologies in November of 2003. The TDT project supports development of technologies that automatically organize event-related news stories. The program leverages expertise in core technologies, Automatic Speech Recognition (ASR), Document Retrieval (DR), and M...
Hierarchical Topic Detection in TDT-2004
The huge volume of news makes it hard for people to keep up with the latest information, so automatic processing of news information becomes necessary. Topic Detection and Tracking is a research program that deals with this problem. From the observations in TDT, news topics can be described in different sizes, making it hard to define the “correct” granularity. In TDT-2004, the topic detection tas...
Using language models for tracking events of interest over time
This paper presents the TNO tracking system which was evaluated at the 2000 Topic Detection and Tracking evaluation project (TDT2000). The objective of the TDT tracking task is to track events of interest over time. We built a baseline tracking system based on a language modeling approach. This approach had proved to be powerful for the TREC adaptive filtering task and several other IR tasks.
TDT-2004: Adaptive Topic Tracking at Maryland
A topic tracking system that combines elements from vector space and language modeling frameworks to compute document scores is described. The model is used for both the traditional TDT topic tracking evaluation design and the new supervised adaptive topic tracking evaluation. Results indicate that supervised adaptation and score normalization should be more closely coupled, and that current te...
Unsupervised Event Clustering in Multilingual News Streams
The Topic Detection and Tracking (TDT) benchmark evaluation project embraces a variety of technical challenges for information retrieval research. The TDT topic detection task is concerned with the unsupervised grouping of news stories according to the events they discuss. A detection system must both discover new events as the incoming stories are processed and associate incoming stor...
Publication date: 2004